Citi Bike is one of the major transportation systems in New York City. In a city known for heavy traffic and limited road space, bike-sharing provides a convenient, affordable, and environmentally friendly alternative to cars and public transit. As a result, Citi Bike is widely used by both residents and visitors for a variety of purposes, including commuting, short errands, and leisure.
While Citi Bike usage continues to grow, riders do not all use the system in the same way. Citi Bike users are divided into two groups: casual riders and members. These two rider types have different travel needs, motivations, and usage patterns. Understanding how they differ is important for evaluating how well the system serves different users and how it can be improved.
In this project, I analyzed how rider type affects ridership and how these differences can inform urban mobility planning.
SQ. How does rider type influence both demand and riding patterns?
Our overarching question investigates how multiple factors, including weather, time, location, bike lane infrastructure, and rider type, affect Citi Bike trip volume in Manhattan. This framing captures the complexity of urban transportation and recognizes that ridership is shaped by both environmental and customer behavior factors.
Rider type is a particularly important dimension because it reflects fundamental differences in user intent and usage. Casual riders and members face different pricing structures and have different travel needs and motivations, which may affect not only how often they ride but also how they use the system.
For this reason, I focused on the specific question of how rider type influences both demand and riding patterns. By narrowing the scope from the overarching question to rider type, my sub-question allows for a deeper analysis of behavioral differences.
Data Acquisition
I selected Citi Bike data covering October 2024 through October 2025.
I obtained the trip data from the official Citi Bike site (https://citibikenyc.com/system-data). The dataset consists of monthly trip records.
Data acquisition and integration were performed using the tidyverse. I used list.files() to identify all monthly Citi Bike trip data CSV files and purrr::map_dfr() to read and row-bind each file into a single merged dataset (citibike_all) for analysis.
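library(tidyverse)

# List all monthly Citi Bike trip CSV files
files <- list.files("data/Course project", pattern = "citibike", full.names = TRUE)

# Merge the files: read each CSV and row-bind into one data frame
# (read_csv as the reader is an assumption; the text names only map_dfr)
citibike_all <- map_dfr(files, read_csv)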
The final dataset includes 1,244,381 Citi Bike trips and 13 columns, covering the period from October 2024 through October 2025.
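A quick sanity check can confirm that a merged table matches figures like these. The sketch below is not part of the original analysis: `check_trips()` is a hypothetical helper, and the three-row `toy_trips` data frame is made up so the example runs on its own in place of `citibike_all`. It assumes `started_at` is stored as text in `YYYY-MM-DD HH:MM:SS` form.

```r
library(lubridate)

# Hypothetical helper (not from the original analysis): summarise the size
# and date coverage of a merged trip table.
check_trips <- function(trips) {
  started <- ymd_hms(trips$started_at)
  list(
    n_trips     = nrow(trips),
    n_columns   = ncol(trips),
    first_start = min(started, na.rm = TRUE),
    last_start  = max(started, na.rm = TRUE)
  )
}

# Toy stand-in for citibike_all (made-up rows, for illustration only)
toy_trips <- data.frame(
  ride_id       = c("a1", "b2", "c3"),
  started_at    = c("2024-10-01 08:00:00", "2025-03-15 17:30:00", "2025-10-31 23:10:00"),
  member_casual = c("member", "casual", "member")
)

summary_info <- check_trips(toy_trips)
str(summary_info)
```

Run on the real data, `check_trips(citibike_all)` would report the row count, column count, and first and last start times to compare against the numbers stated above.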
Data Visualization
1. Duration Gap Between Casual Riders and Members
I examined the duration gap between casual riders and members to understand how usage behavior differs across rider types.
I used tidyverse packages, including dplyr, lubridate, tidyr, and ggplot2.
First, I converted the start and end timestamps to datetime format and computed trip duration in minutes. Then I extracted temporal features, including hour of day and day of week, to capture time-based usage patterns.
Next, I calculated average trip duration by rider type (casual vs. member), hour of day, and day of week.
Finally, I computed the duration gap as the difference in average trip duration between casual and member riders (casual minus member) and visualized it using a heatmap.
library(dplyr)
library(lubridate)
library(ggplot2)
library(tidyr)

df <- citibike_all

# Preprocess: duration + hour/day
df <- df %>%
  mutate(
    started_at = ymd_hms(started_at),
    ended_at = ymd_hms(ended_at),
    ride_duration_min = as.numeric(difftime(ended_at, started_at, units = "mins")),
    hour = hour(started_at),
    dow = wday(started_at, label = TRUE, abbr = TRUE)
  )

# Average duration
avg_duration <- df %>%
  group_by(member_casual, dow, hour) %>%
  summarise(
    avg_duration = mean(ride_duration_min, na.rm = TRUE),
    .groups = "drop"
  )

duration_wide <- avg_duration %>%
  pivot_wider(
    names_from = member_casual,
    values_from = avg_duration
  )

# Duration gap
gap_df <- duration_wide %>%
  mutate(duration_gap = casual - member)

# Heatmap
ggplot(gap_df, aes(x = hour, y = dow, fill = duration_gap)) +
  geom_tile(color = "white") +
  scale_fill_viridis_c(
    option = "magma",
    direction = -1,
    name = "Duration Gap\n(casual - member)"
  ) +
  labs(
    title = "Duration Gap Between Casual and Member Riders",
    subtitle = "Higher values indicate longer trips by casual riders",
    x = "Hour of Day",
    y = "Day of Week"
  ) +
  theme_minimal(base_size = 15)
The heatmap shows a clear duration gap between casual riders and annual members across both time of day and day of week. In almost all periods, casual riders have longer average trip durations than members, suggesting different usage patterns.
The gap is smallest during weekday morning hours, which likely reflects commuting behavior shared by both rider types. In contrast, the gap becomes larger in the afternoon and evening and is most pronounced on weekends, indicating that casual riders tend to use Citi Bike for longer, more recreational trips during these times.
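The weekday versus weekend contrast read off the heatmap could also be summarised numerically. The following is a minimal sketch under stated assumptions, not part of the original analysis: `summarise_gap_by_day_type()` is a hypothetical helper, it assumes English day labels ("Sat", "Sun") as produced by `wday(label = TRUE, abbr = TRUE)` in an English locale, and the `toy_gap` values are made up purely to show the mechanics.

```r
library(dplyr)

# Hypothetical helper: average the casual-minus-member gap separately for
# weekdays and weekends. Expects columns `dow` (abbreviated English day
# labels such as "Mon") and `duration_gap` (minutes).
summarise_gap_by_day_type <- function(gap_df) {
  gap_df %>%
    mutate(
      day_type = if_else(as.character(dow) %in% c("Sat", "Sun"), "weekend", "weekday")
    ) %>%
    group_by(day_type) %>%
    summarise(mean_gap = mean(duration_gap, na.rm = TRUE), .groups = "drop")
}

# Made-up values for illustration only; not results from the Citi Bike data
toy_gap <- tibble(
  dow          = c("Mon", "Mon", "Sat", "Sun"),
  hour         = c(8, 18, 14, 14),
  duration_gap = c(2, 6, 12, 10)
)

gap_summary <- summarise_gap_by_day_type(toy_gap)
print(gap_summary)
```

Applied to the real `gap_df`, the same helper would give one average gap for weekdays and one for weekends, which can be checked against the visual impression from the heatmap.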
2. Monthly Trip Volume
To analyze monthly Citi Bike usage patterns, I used the dplyr and ggplot2 packages.
I aggregated trip-level data to the monthly level by extracting the year-month from the trip start timestamp. For each month and rider type (casual vs. member), total trip counts were computed to summarize overall usage volume.
I created a grouped bar chart to compare monthly trip counts between casual riders and members. This chart makes it easy to see differences in usage patterns across rider types and shows clear seasonal changes in Citi Bike usage.
library(dplyr)
library(ggplot2)

# 1. Create monthly aggregated data
monthly_volume <- df %>%
  mutate(month = format(as.Date(started_at), "%Y-%m")) %>%
  group_by(month, member_casual) %>%
  summarise(trip_count = n(), .groups = "drop")

# 2. Keep only data from October 2024 onward
monthly_volume <- monthly_volume %>%
  filter(month >= "2024-10")

# 3. Order the month factor properly
monthly_volume$month <- factor(
  monthly_volume$month,
  levels = sort(unique(monthly_volume$month))
)

# 4. Minimal clean version (no background)
ggplot(monthly_volume, aes(x = month, y = trip_count, fill = member_casual)) +
  geom_col(position = "dodge") +
  labs(
    title = "Monthly Trip Volume by Rider Type",
    x = "Month",
    y = "Trip Count",
    fill = "Rider Type"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    legend.position = "right",
    plot.title = element_text(size = 16, face = "bold")
  )
The bar chart shows monthly Citi Bike trip volume by rider type from October 2024 to October 2025. Across all months, annual members consistently account for a larger share of trips than casual riders, highlighting the role of Citi Bike as a regular transportation option for many users.
At the same time, strong seasonal patterns are evident. Trip volume declines sharply during the winter months and increases steadily in the spring, reaching a peak in the summer. This seasonal effect is particularly pronounced among casual riders, whose usage rises sharply during warmer months, suggesting greater recreational and tourist-driven demand. Conversely, member usage remains relatively stable throughout the year, reflecting more routine commuting behavior.
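One way to put a number on how much more seasonal casual ridership is than member ridership is a peak-to-trough ratio per rider type. This is a rough sketch rather than part of the original analysis: `seasonal_swing()` is a hypothetical helper that assumes the `month`, `member_casual`, and `trip_count` columns of `monthly_volume`, and the `toy_monthly` counts are invented for illustration.

```r
library(dplyr)

# Hypothetical helper: compare each rider type's busiest and quietest month.
# Expects the columns of monthly_volume: month, member_casual, trip_count.
seasonal_swing <- function(monthly_volume) {
  monthly_volume %>%
    group_by(member_casual) %>%
    summarise(
      min_trips      = min(trip_count),
      max_trips      = max(trip_count),
      peak_to_trough = max_trips / min_trips,
      .groups = "drop"
    )
}

# Made-up counts for illustration only; not results from the Citi Bike data
toy_monthly <- tibble(
  month         = rep(c("2025-01", "2025-07"), each = 2),
  member_casual = rep(c("casual", "member"), times = 2),
  trip_count    = c(100, 1000, 500, 2000)
)

swing <- seasonal_swing(toy_monthly)
print(swing)
```

A larger ratio means a stronger seasonal swing, so comparing the two rows of `seasonal_swing(monthly_volume)` would show whether casual ridership really varies more across the year than member ridership.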
Limitations
This study has several limitations. First, the analysis relies on observational data, which prevents causal interpretation of differences between casual and member riders. Second, while strong seasonal patterns are evident, detailed weather conditions were not explicitly included in the analysis. In addition, the dataset does not capture trip purpose, so behavioral differences must be inferred from duration and timing. Finally, the analysis is limited to a one-year period and may not reflect longer-term trends in Citi Bike usage.
Conclusion
The analysis shows clear behavioral differences between the two groups. Casual riders tend to take longer trips, especially during weekends and non-commute hours, while members take shorter, more consistent trips throughout the week.
One of the main findings is a strong seasonal pattern in Citi Bike usage. Overall trip volume increases during warmer months, with casual ridership driving much of the summer peak. In contrast, member usage remains relatively stable across the year, suggesting that members primarily use Citi Bike for regular transportation needs.
These insights suggest that rider type plays an important role in shaping usage patterns and should be considered in future planning and policy decisions.